Dr. William O. Berry
Program  Manager 
Life  Sciences  Directorate 
Air  Force  Office  of  Scientific 
Research  (AFSC) 

Bolling  Air  Force  Base,  D.C.  20332 


Re:  AFOSR  83-0215 

(Adaptive  Networks) 


Dear Bill:


This  Research  Progress  and  Forecast  Report  covers  the  period  from 
May  1,  1984  to  the  present.  There  have  been  a  number  of  significant 
developments  in  our  research  on  adaptive  neural  networks  using  the 
classically  conditioned  nictitating  membrane  (NM  CR)  in  rabbit  as  a  model 
system. 

As  you  know,  we  are  placing  a  greater  emphasis  on  the  theoretical 
aspects  of  learning  than  originally  proposed.  We  seek  to  develop  and 
evaluate  real-time  mathematical  models  that  have  potential  applicability 
to learning in the AI domain. Our primary goal is to assess computational
versions of these models against behavioral and physiological data
on  the  NM  CR  using  simulation  experiments.  Our  most  recent  efforts  have 
focused  on  three  classes  of  models. 

1.  Physiologically  Constrained  Sutton-Barto  Model.  This  algorithm 
was originally developed by John E. Desmond. With additional work by
Neil  Berthier,  and  with  advice  from  Rich  Sutton,  this  model  has  been 
rendered  into  a  form  that  can  make  predictions  about  CR  topography  and 
the  firing  pattern  of  neurons  related  to  the  CR.  The  original  version  of 
the  model  could  do  this  reasonably  well  for  the  case  of  a  single  CS  paired 
with the US in a forward-delay paradigm. The model has now been generalized
so that it can predict CR topography (or simply associative strength)
and  single-unit  physiological  data  within  complex  training  paradigms  that 
involve two CSs with independent on- and off-times with respect to each
other and the US. These paradigms are analogous to spatial and temporal
credit-assignment  problems  in  reinforcement  learning.  Simulations  are  now 
performed  on  the  University's  VAXEN  computer  network,  the  same  system  used 
by Andy Barto and his collaborators in the Department of Computer and
Information Science (COINS). A representative simulation of a two-CS
conditioning experiment is included with this report.
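
For readers unfamiliar with this class of model, the following is a minimal sketch of the basic (unconstrained) Sutton-Barto update as it is commonly stated: each CS carries an eligibility trace, and its associative strength V changes in proportion to the change in the output y times that trace. It is not the physiologically constrained version described above, whose details are not given in this report. The function name, the time step, and all parameter values (c, alpha, beta, us_weight) are assumptions chosen for illustration only.

```python
def simulate_trial(V, cs_on, cs_off, us_on, us_off, n_steps,
                   c=0.05, alpha=0.9, beta=0.0, us_weight=1.0):
    """Run one conditioning trial with a basic Sutton-Barto-style update.

    V       : list of associative strengths, one per CS (updated in place)
    cs_on   : list of CS onset steps; cs_off: list of CS offset steps
    us_on   : US onset step; us_off: US offset step (equal values = no US)
    Returns the output y at each time step.
    All parameter values are illustrative assumptions, not fitted values.
    """
    n_cs = len(V)
    xbar = [0.0] * n_cs   # eligibility trace of each CS
    ybar = 0.0            # trace of the preceding output
    ys = []
    for t in range(n_steps):
        x = [1.0 if cs_on[i] <= t < cs_off[i] else 0.0 for i in range(n_cs)]
        us = 1.0 if us_on <= t < us_off else 0.0
        # Output: weighted sum of CS inputs plus the US input, bounded to [0, 1].
        y = sum(V[i] * x[i] for i in range(n_cs)) + us_weight * us
        y = min(1.0, max(0.0, y))
        # Learning: change in output times the eligibility of each CS.
        for i in range(n_cs):
            V[i] += c * (y - ybar) * xbar[i]
        # Update traces for the next time step.
        for i in range(n_cs):
            xbar[i] = alpha * xbar[i] + x[i]
        ybar = beta * ybar + (1.0 - beta) * y
        ys.append(y)
    return ys


# Forward-delay conditioning with a single CS whose offset meets US onset.
V_paired = [0.0]
for _ in range(50):
    simulate_trial(V_paired, cs_on=[5], cs_off=[15], us_on=15, us_off=16, n_steps=30)

# The same CS presented alone (no US) acquires no strength.
V_alone = [0.0]
for _ in range(50):
    simulate_trial(V_alone, cs_on=[5], cs_off=[15], us_on=0, us_off=0, n_steps=30)
```

A two-CS paradigm is expressed in this sketch simply by passing two onsets and two offsets, e.g. `cs_on=[0, 35]` and `cs_off=[70, 70]` for coterminating stimuli.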

2. Physiologically Constrained Moore-Stickney Model. The Moore-Stickney
model is a real-time rendering of an attentional-associative
learning  model  developed  originally  by  N.  J.  Mackintosh  in  England. 

This model has been applied to classical conditioning in complex
(multiple-CS) paradigms and to goal-seeking behavior. Its main success to
date has been to describe the effects of hippocampal lesions in associative
learning tasks such as these. Nestor Schmajuk and I are completing
a  paper  describing  simulation  experiments  with  the  latest  variant  of  the 
Moore-Stickney  model.  As  part  of  his  dissertation  project,  Schmajuk  is 
applying  the  model  to  the  problem  of  CR  topography  and  correlated 
neuronal  activity. 

3. Real-Time Pearce-Hall Model. An alternative to the Mackintosh
type  of  attentional  learning  model  was  proposed  by  Pearce  and  Hall  in 
England.  Schmajuk  and  I  have  been  developing  real-time  computational 
variants  of  the  basic  Pearce-Hall  model  for  application  to  network 
learning problems involving the hippocampus. The results of some simulation
experiments under this model have been submitted for publication
and  additional  simulations  of  hippocampal  lesion  effects  are  described 
in  the  paper  mentioned  in  the  preceding  paragraph. 

Theoretical activity has been intense these past few months because
of  the  month-long  visit  by  Dr.  E.  J.  Kehoe  of  Australia.  Kehoe  and  his 
collaborators  have  generated  much  of  the  behavioral  data  from  the  NM  CR 
preparation on multiple-CS effects. His visit in September set the
occasion for a series of seven seminars, "Computational Learning Models
and Test Beds," with presentations by Barto, Desmond, Kehoe, Moore and
Schmajuk  with  significant  contributions  from  Neil  Berthier  and 
Rich  Sutton. 

Since Kehoe's departure, we have met formally to consider predictions
from several models regarding an experiment performed by my colleague,
Joe Ayres, with relevance for real-time models. Ayres' experiment
concerned the effects of extending a CS in time beyond the reinforcing
event. Sutton, Desmond, and Schmajuk discussed the question from the
viewpoint of several models, including the basic Sutton-Barto Model, a
variant of the Sutton-Barto Model designated by Sutton as the Adaptive
Heuristic Critic with Discount, Desmond's physiologically constrained
version of the Sutton-Barto Model, and the Moore-Stickney Model of
attentional-associative networks. These formal meetings have been extremely
valuable because they (a) enhance the group's understanding of the various
models, (b) familiarize colleagues and students with computational
approaches to associative learning, and (c) generate ideas for experiments
that can provide clear-cut discriminations among models. In this regard,
Kehoe and I plan to conduct some behavioral experiments relative to these
models in parallel in our respective laboratories.

Most  of  our  theoretical  work  has  relied  on  the  experimental  literature 
on  behavioral  aspects  of  the  NM  CR,  i.e.,  on  measures  of  associative 
strength  and  on  CR  topography.  Neuronal  schemas  that  represent  the  various 
brain systems that might perform a given computational function within a
given  model  are  based  primarily  on  data  from  lesion  studies.  For  example, 
in  the  Moore-Stickney  Model  the  hippocampal  formation  is  portrayed  as  a 
system that reduces the salience or associability of stimuli depending on
the  status  of  the  associative  network  underlying  behavior,  an  assumption 
based on observed effects of hippocampal lesions. The behavioral/biological
foundations of such schemas will improve as relevant physiological
evidence begins to surface. It is for this reason that we are accumulating
single-unit recordings concurrently with behavioral testing. Success
in  this  domain  demands  that  physiological  recordings  include  neurons 
within brain structures known to be essential for the NM CR. (Our Annual
Technical Report dated May 20, 1984 covers these points more fully.) In
addition,  it  is  necessary  to  have  computer-assisted  statistical  techniques 
that  can  quantify  the  relationship  between  neuronal  firing  and  the  fine- 
grain  details  of  the  behavioral  CR.  John  E.  Desmond  has  developed  an 
impressive  arsenal  of  quantitative  tools  for  this  purpose.  Examples  are 
included  with  this  report.  These  same  quantitative  tools  can  be  extended 
to  quantification  of  the  relationship  between  theory-generated  neuronal 
firing  and  the  activity  of  actual  CR-related  neurons. 
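
As a rough illustration of the kind of quantification meant here, the sketch below bins spike times into a peristimulus-time histogram (mean spikes per trial per bin) and flags bins that depart from pre-CS activity. It is not Desmond's software and does not reproduce the t or binomial tests those programs offer; a simple z score against the baseline bins is swapped in to keep the example self-contained. The function names, the z criterion, and the sample spike times are all assumptions.

```python
from statistics import mean, pstdev


def psth(trials, t_start, t_stop, bin_ms):
    """Mean spikes per trial in each bin.

    trials : list of spike-time lists, in ms relative to CS onset
    Bins cover [t_start, t_stop) in steps of bin_ms.
    """
    n_bins = int((t_stop - t_start) // bin_ms)
    counts = [0] * n_bins
    for spikes in trials:
        for s in spikes:
            idx = int((s - t_start) // bin_ms)
            if s >= t_start and idx < n_bins:
                counts[idx] += 1
    return [c / len(trials) for c in counts]


def flag_bins(hist, n_baseline_bins, z_crit=1.96):
    """Flag bins whose value departs from the pre-CS baseline.

    Assumption: a z score against the baseline bins stands in for the
    t or binomial test; z_crit is an illustrative criterion.
    """
    base = hist[:n_baseline_bins]
    mu, sd = mean(base), pstdev(base)
    if sd == 0:
        return [h != mu for h in hist]
    return [abs(h - mu) / sd > z_crit for h in hist]


# Two made-up trials; negative times precede CS onset.
trials = [[-150, -50, 20, 25, 30], [-120, -30, 22, 28]]
hist = psth(trials, t_start=-200, t_stop=100, bin_ms=100)
flags = flag_bins(hist, n_baseline_bins=2)
```

The same two functions could be applied unchanged to spike times produced by a model simulation, which is the extension to theory-generated firing mentioned above.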

There  has  been  an  important  development  concerning  brain  regions 
essential for the NM CR. Contrary to published reports as recent as last
year, it now appears that cerebellar cortex as well as deep nucleus
interpositus is essential for the NM CR. This came to light from lesion
experiments by Yeo, Hardiman, and Glickstein at University College London, a
group with whom I have collaborated for several years. A lesion confined
to a portion of cerebellar cortex known to anatomists as the simplex lobe
(a.k.a. hemispheric lobule VI, or HVI, for short) profoundly disrupts a
previously acquired NM CR while having no effect on the underlying reflex
(UR). In short, lesion of HVI, but no other portion of cerebellar cortex,
produces the same deficit as do lesions of "downstream" premotor components
of the cerebello-rubro-circuit essential for the CR.


I  was  able  to  examine  relevant  histological  material  and  data  during 
a  visit  to  the  London  laboratory  this  past  June,  and  we  have  reproduced 
the  effect  of  HVI  lesions  in  our  laboratory.  These  lesion  results,  and 
other  anatomical  considerations,  suggest  that  neurons  within  HVI  are 
crucial  for  the  learning  and  execution  of  the  NM  CR.  We  have  recently 
begun  recording  from  single  neurons  of  HVI  during  behavioral  testing. 

We  are  especially  interested  in  the  relationship  between  the  Purkinje 
cell  activity  and  the  CR  because  Purkinje  cells  are  featured  prominently 
in  several  mathematical  models  of  cerebellar  learning  (e.g.,  Albus, 
Gilbert,  Marr).  Photocopies  of  the  experimental  set  up  for  recording 
from  Purkinje  cells  and  sample  data  obtained  by  Neil  Berthier  are 
included  with  this  report. 

In order to establish the involvement of Purkinje cells in the
NM  CR,  it  is  necessary  to  record  from  them  during  behavioral  testing. 
There  are  a  number  of  questions  we  intend  to  pursue: 

1.  Is  there  a  relationship  between  Purkinje  cell  activity  and 
CR  topography? 

2.  What  is  the  nature  of  this  relationship  and  how  well  is  it 
described  by  computational  models? 

3. Are CR-related Purkinje cells limited to HVI, or do they exist
in  other  regions  of  the  cerebellar  cortex? 

4.  Assuming  that  Purkinje  cells  are  causally  implicated  in  the  NM 
CR,  precisely  how  do  they  influence  other  premotor  components  of  the  CR, 
e.g., nucleus interpositus, red nucleus, and supratrigeminal reticular
formation? 

I  can  provide  more  detailed  information  regarding  any  aspect  of  this 
report  immediately  on  request. 


Cordially, 


John  W.  Moore,  Ph.D. 

Professor  of  Psychology 
(Neuroscience  and  Behavior) 
Associated  Professor  of 
Computer and Information Sciences

FIGURE 1. Photograph of rabbit in recording apparatus. This
animal has been prepared for brain stem recordings with a hollow
pedestal through which recording electrodes are introduced into the
brain.

FIGURE 2. Representative CRT tracings (100 ms/div) from
single conditioning trials showing (upper traces) NM activity and
(lower traces) single SR units. A. An on-unit from Animal 32B to a
noise CS+. Note increase in firing about 100 ms before the CR. B.
Another on-unit in Animal 32B showing increased activity concurrent
with the CR to a tone CS-. C. An off-unit in Animal 17 to a tone CS.
Note decrease in firing approximately 50 ms before CR. D. The same
unit as in C, but on an extinction trial with no CR. E. An on-unit
in Animal 6 showing an increase in activity concurrent with the CR
to a noise CS+. F. The same unit as in E during a trial with a tone
CS- and no CR.

FIGURE 3. Representative computer-generated peristimulus-time
histograms and averaged NM activity. These are based on software for
the Apple II developed by P. R. Solomon and his colleagues at
Williams College. Vertical bars indicate the CS-US interval: The
left-hand bar denotes CS onset. The right-hand bar denotes US onset,
if it occurs; it is shown for nonreinforced trials to ease
comparisons. A. Summary of 21 CS+ trials with CRs for Animal 32B
showing a typical on-unit. The dots indicate statistically
significant counts in relation to pre-CS activity. The program
provides a menu of tests (t or binomial), significance levels
([illegible], .05), and null hypothesis rejection regions (1 or 2
tails). B. The same unit as in A, but on 15 CS- trials with no CRs.
C. An off-unit in Animal 17 during 16 CS+ trials. D. The same unit
as in C on 10 CS- trials.

FIGURE 4. A. Peristimulus-time histogram and averaged NM
response over 33 CS+ trials in Animal 02A. B. The same unit as in A
over 12 CS- extinction trials with no CRs. Panels A and B are simply
reminders to assist understanding Panels C and D. C.
Time-correlogram based on the data in Fig. 3A. D. Time-correlogram
based on the data in Fig. 3C.

FIGURE 5. Column 1: Second-order conditioning using two 350
ms CSs and no US. CS1 offset is contiguous with CS2 onset. Initial
V's: CS2 = 0.9, CS1 = 0. (A). Y as a function of time for a trial
in which CSs are serially presented. The function is a result of 50
conditioning trials. Hash marks on the abscissa represent, from left
to right, CS1 onset, CS2 onset, CS2 offset. (B). V as a function of
trials for CS1 and CS2, plotted over 50 trials. (C). Peristimulus
time histogram of simulated neuronal activity accumulated over 50
CS1-CS2 presentations. Hash marks on abscissa are the same as in A.
Column 2: Serial compound conditioning using a 700 ms CS1, 350 ms
CS2, and 30 ms US. CS1 and CS2 coterminate, and their offset is
contiguous with US onset. Initial V = 0 for both CSs. (D). Y as a
function of time for a compound stimulus presentation after 100
trials. Hash marks on abscissa represent, from left to right, CS1
onset, CS2 onset, US onset (and compound offset). (E). V as a
function of trials for CS1 and CS2, plotted over 100 trials. (F).
Peristimulus time histogram of simulated neuronal activity
accumulated over 100 compound stimulus presentations. Abscissa hash
marks are identical to those in D.

FIGURE 6. Conditioned inhibition, in which two trial types
are presented. For the first type, a reinforced compound, CS+,
consists of a simultaneous presentation of CS1 and CS2; the US onset
is contiguous with the compound offset. The second trial type
consists of an unreinforced presentation of CS2 alone. CS1 and CS2
are both [illegible] ms in duration. US is 30 ms in duration. Initial
V's are 0 for both CSs. (A). Y as a function of time for both trial
types. Hash marks on the abscissa represent, from left to right, the
onset and the offset of either CS+ or CS-. (B). V as a function of
trials for CS+ and CS-, plotted over 100 trials. (C). Peristimulus
time histogram of simulated neuronal activity to CS- presentations.
Spike counts are accumulated over 50 trials starting after the
initial trials of training. Abscissa hash marks are as indicated
in A. (D). Peristimulus time histogram of simulated neuronal
activity to CS+ presentations. Spike counts are accumulated as
described in C.

SUB: 32BN+ CR  TRIALS=21  CS=350 MS
BIN=10 MS  V.CAL.=.2 CNTS, 3 VOLTS
T TEST WAS USED
.05 2-TAILED DF=34
LEAD TIME -50 MS
[illegible] BINS OMITTED
CR ONSET AT 5% MAX AMP
NEURAL ONSET AT SIG. BIN # 9
MEAN SPIKES/TRIAL DEPICTED

SUB: 32BT- NO CR  TRIALS=15  CS=350 MS
BIN=10 MS  V.CAL.=.2 CNTS, 3 VOLTS
T TEST WAS USED
.05 2-TAILED DF=34
LEAD TIME=-370 MS
0 BINS OMITTED
CR ONSET AT 5% MAX AMP
NEURAL ONSET AT SIG. BIN # 1
MEAN SPIKES/TRIAL DEPICTED

T TEST WAS USED
[illegible] 2-TAILED DF=34
LEAD TIME=100 MS
0 BINS OMITTED
CR ONSET AT [illegible] MAX AMP
NEURAL ONSET AT SIG. BIN # 5
MEAN SPIKES/TRIAL DEPICTED

MEMORANDUM

From: John Moore                    Date: September 10, 1984

To: Interested Parties

Subject: Computational Learning Models and Test Beds


Jim Kehoe is visiting for one month. He will be giving three lectures on his
research into Compound Stimulus Conditioning. His research provides data for
developing and evaluating computational models of conditioning, with special
emphasis on CR topography.

We are scheduling a series of lunch time lectures beginning Monday, September 10,
1984. Other talks on modelling CR topography and/or related topics are also scheduled.

A tentative schedule appears below:

Time: Noon
Place: Room 101, Middlesex House


1. Monday, September 10, 1984

Jim Kehoe, "Serial Compounds"

2. Wednesday, September 12, 1984

Jim Kehoe, "Combination Rules"

3. Monday, September 17, 1984

John Desmond, "CR Topography and Neurophysiological Correlates"

4. Wednesday, September 19, 1984

"Introduction to Attentional Theories", with John Moore

5. Monday, September 24, 1984

Nestor Schmajuk, "Real-Time Attentional Models and CR Topography"

6. Wednesday, September 26, 1984

Andy Barto, "Maybe Layered Networks"

7. Monday, October 1, 1984

Jim Kehoe, "Goodbye and Lots of Luck"

UNIVERSITY OF MASSACHUSETTS
AMHERST

MEMORANDUM

From: John Moore                    Date: 20 November 1984

To: Ayres, Barto, Berthier, Desmond, and Schmajuk

Subject: Meeting of 14 November 1984


1.  Ayres  presented  the  rationale  and  outcome  of  an  experiment  relevant  to 
real-time learning theories of interest to this group. A draft manuscript
entitled, "Extending CS Beyond vs. Prior to Reinforcement" is
attached. 

2. Sutton reported that the data from Ayres' study are basically consistent
with the original Sutton-Barto model. However, the derivative model,
Adaptive Heuristic Critic with Discount (AHC-D), predicts an asymmetrical
outcome such that extension of a CS beyond the US causes a greater reduction
of associative value than an equal extension in a forward direction.

3. Schmajuk presented the results of simulation experiments with a version
of the Moore-Stickney model described in a draft manuscript entitled,
"Variations of CS Effectiveness Revisited: Modified Attentional Models of
Classical Conditioning". This model, as does AHC-D, predicts the asymmetry
whereby post-US extension of a CS causes greater reductions of
associative value than does forward extension of the CS.

4.  Desmond  presented  simulations  from  the  physiologically  constrained 
version  of  the  Sutton-Barto  model  that  he  developed  with  Neil  Berthier's 
help.  These  simulations  also  indicated  asymmetries  as  noted  above. 

The  group  seemed  to  agree  that: 

1. Ayres' experiment was indeed relevant to real-time models.

2.  A  simpler  experiment  with  the  "blocked"  CS  would  likely  yield  different 
anticipated  results  than  those  under  consideration  here. 

3.  Anticipated  outcomes  from  the  point  of  view  of  the  various  models  depend 
critically  on  parameters.  Thus,  for  example,  simulated  experiments  on 
CS extensions before and after a US depend on the duration of the CS,
knowledge  of  the  optimal  ISI,  and/or  rates  of  recruitment  and  decay  of 
theoretical  variables.  These  considerations  should  guide  choice  of 
designs  and  parameters  of  real  experiments. 


